Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Lazy import builders, embedders, (chat)generators, retrievers #8655

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

julian-risch
Copy link
Member

@julian-risch julian-risch commented Dec 17, 2024

Related Issues

Proposed Changes:

  • Add lazy imports for builders, embedders, (chat)generators, retrievers

How did you test it?

Notes for the reviewer

I just added the lazy import for a small selection of components in this draft. All components used in our "first RAG pipeline" tutorial are covered, which comes down to:

from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from datasets import load_dataset
from haystack import Document
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import ChatPromptBuilder, AnswerBuilder
from haystack.dataclasses import ChatMessage
import os
from getpass import getpass
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack import Pipeline

Tests are failing with TypeErrors. We also need to check whether type hints or auto completion in IDEs are affected in a bad way by the lazy import. For example, when add an AnswerBuilder, I get two suggestions about importing either the class or the function AnswerBuilder
Screenshot 2024-12-17 at 17 41 05

Here is a comparison first without and then with lazy imports. Created with

 python -X importtime 27_first_rag_pipeline.py 2> lazy-import.log
 tuna lazy-import.log

without lazy imports
Screenshot 2024-12-17 at 18 16 02
with lazy imports
Screenshot 2024-12-17 at 18 16 13

An alternative to lazy imports would be to require users to know and always use the full path of the component they want to import.
from haystack.components.generators.hugging_face_local import HuggingFaceLocalGenerator for example, instead of just from haystack.components.generators import HuggingFaceLocalGenerator

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@github-actions github-actions bot added the type:documentation Improvements on the docs label Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:documentation Improvements on the docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Importing one component of a certain family/module leads to importing all components of the same family/module
1 participant